A Hybrid ANN-GA Model to Prediction of Bivariate Binary Responses: Application to Joint Prediction of Occurrence of Heart Block and Death in Patients with Myocardial Infarction

Background: In medical studies, when the occurrence of two events must be predicted jointly, a statistical bivariate model is used. Because of the limitations of conventional statistical models, other methods such as Artificial Neural Network (ANN) and hybrid models can be used. In this paper, we propose a hybrid Artificial Neural Network-Genetic Algorithm (ANN-GA) model to predict the occurrence of heart block and death in myocardial infarction (MI) patients simultaneously. Methods: For fitting and comparing the models, 263 new patients with a definite diagnosis of MI hospitalized in the Cardiology Ward of Hajar Hospital, Shahrekord, Iran, from March 2014 to March 2016 were enrolled. Occurrence of heart block and death were used as bivariate binary outcomes. Bivariate Logistic Regression (BLR), ANN and hybrid ANN-GA models were fitted to the data. Prediction accuracy was used to compare the models. The models were implemented in Matlab 2013a and with the Zelig package in R 3.2.2. Results: The prediction accuracy of the BLR, ANN and hybrid ANN-GA models was 77.7%, 83.69% and 93.85% for the training data and 78.48%, 84.81% and 96.2% for the test data, respectively. In both the training and test data sets, the hybrid ANN-GA model had the best accuracy. Conclusions: The ANN model can be a suitable alternative for modeling and predicting bivariate binary responses when the presuppositions of statistical models are not met in actual data. In addition, optimization methods, such as the hybrid ANN-GA model, can improve the precision of the ANN model.


Introduction
The joint occurrence of correlated events is a recurring concern for researchers in medical sciences. In classical statistics, when the aim is the joint prediction of two correlated events (or response variables), bivariate models are used. When both response variables are qualitative, the bivariate logistic regression model is used.
Traditional statistical models usually rest on certain presuppositions, such as a specified distribution of the response variables, a linear relationship between dependent and independent variables, and equal error variances, which may not hold in actual data 1 .
Artificial Neural Network (ANN) can be an alternative to classical statistical models: it does not require the presuppositions mentioned above and can easily be fitted to linear and nonlinear relationships 2 . The multi-layer perceptron (MLP) is the most commonly used form of ANN. Learning in an MLP is based on the backpropagation (BP) algorithm, which minimizes the sum of squared errors 3 .
To improve the efficiency of an ANN model, its parameters, such as the initial values of the weights, need to be optimized. The genetic algorithm (GA) is one of the optimization methods used with ANN models 4 . GA, an optimization technique based on the "survival of the fittest" principle of Darwin's theory of evolution, was first developed by John Holland. The basic operations in GA are reproduction, crossover and mutation. By combining ANN and GA, we can expect more accurate results 5 . The flowchart of a typical genetic algorithm is shown in Figure 1 6 .
Acute myocardial infarction (AMI) refers to permanent and irreversible cell death in a part of the myocardium, caused by the loss of blood flow and the occurrence of severe ischemia in that part. Despite wide diagnostic advancements, nearly 33% of the patients with myocardial infarction (MI) die and 5%-10% of the surviving patients die within the first year after MI. There are approximately 1,500,000 AMI patients and about 25% of the mortality is attributed to this disease in the USA 7 . The Iran Ministry of Health and Medical Education has reported that about 39% of total mortality in Iran is due to cardiovascular diseases 8,9 . Over 60% of the deaths due to MI occur within one hour after MI, and most of them are caused by arrhythmias, with ventricular fibrillation and bundle branch block as two prevalent types. Nowadays, MI is the most common cause of death in many communities and is associated in hospitals with several complications such as atrioventricular node block and bundle branch block. According to a WHO report, AMI is the leading cause of mortality in the world, particularly in Iran, and cardiac arrhythmias are the most prevalent cause of death from AMI 10 . Heart blocks are an important class of arrhythmias and lead to prolonged hospitalization and increased in-hospital mortality. Therefore, they attract attention 10 .
Because medical studies concern human health, precise and accurate predictions are of great importance in them. Given the limitations of traditional statistical methods in modeling bivariate responses, in this paper we introduce a new approach with fewer restrictions, based on a hybrid ANN-GA method, for modeling and predicting bivariate binary responses, and we use this model to predict the occurrence of heart block and death in MI patients simultaneously. We also compare the prediction accuracy of this model with that of the BLR and ANN models.

Methods
To evaluate the suitability of the proposed model compared with traditional methods for modeling and predicting bivariate binary responses, we used data from a cross-sectional study. In this study, 263 new patients with a definite diagnosis of MI hospitalized in the Cardiology Ward of Hajar Hospital, Shahrekord, Iran, from March 2014 to March 2016 were enrolled. MI was diagnosed by a cardiologist according to the WHO criteria and the International Classification of Diseases (ICD-10 codes I24.9, I25.2, I22, and I21). Demographic characteristics and clinical history of the patients were collected with a checklist at the time of admission.
To fit the ANN model, the same training and test data sets were used as for the bivariate logistic regression. Because the outcome in this research is bivariate, assuming p input nodes, where p is the number of covariates, one hidden layer with M nodes, and 2 nodes in the output layer, the ANN architecture can be written as:
$$\hat{y}_i = \Psi_o\left(\beta_0 + \sum_{j=1}^{M} \beta_j \, \Psi_h\left(w_{j0} + \sum_{s=1}^{p} w_{js} x_{is}\right)\right)$$
where $w_{js}$ is the weight for input $x_{is}$ at hidden node j, $\beta_j$ is the weight attached to hidden node j, and $w_{j0}$ and $\beta_0$ are the biases for the hidden and output nodes, respectively. $\Psi_h$ is the activation function of the hidden layer and $\Psi_o$ is the activation function of the output layer 2 .
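The architecture described above (p inputs, one hidden layer of M nodes, two output nodes) can be sketched as a single forward pass. This is a minimal illustration in plain Python, not the Matlab code used in the study; the function names, the argument layout, and the choice of a sigmoid for both activation functions are assumptions made for the example.

```python
import math

def sigmoid(z):
    """Logistic activation, assumed here for both the hidden and output layers."""
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    """Forward pass of a one-hidden-layer MLP with two output nodes.

    x        : list of p covariate values for one patient
    hidden_w : M rows of p weights (the w_js in the text)
    hidden_b : M hidden-node biases (the w_j0)
    out_w    : 2 rows of M weights (the beta_j, one row per output node)
    out_b    : 2 output-node biases (the beta_0, one per output node)
    """
    hidden = [sigmoid(b + sum(w * xi for w, xi in zip(row, x)))
              for row, b in zip(hidden_w, hidden_b)]
    return [sigmoid(b + sum(w * h for w, h in zip(row, hidden)))
            for row, b in zip(out_w, out_b)]

# Hypothetical example: p = 2 covariates, M = 3 hidden nodes, all weights zero.
# Every sigmoid then receives 0, so both outputs equal 0.5.
outputs = mlp_forward([1.0, 2.0],
                      [[0.0, 0.0]] * 3, [0.0] * 3,
                      [[0.0] * 3] * 2, [0.0, 0.0])
print(outputs)  # [0.5, 0.5]
```

Each of the two outputs can be read as a predicted probability for one binary outcome; how those probabilities are turned into 0/1 predictions (for example, a 0.5 cutoff) is not stated in the text and would be a further assumption.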
We fitted an MLP with one hidden layer containing 8-14 nodes. The mean square error (MSE) criterion was used to choose the number of nodes in the hidden layer. The sigmoid activation function was used for the hidden and output layers. Several training algorithms, including gradient descent (GD), gradient descent with momentum (GDM), conjugate gradient algorithm (CGA), scaled conjugate gradient (SCG), Broyden-Fletcher-Goldfarb-Shanno (BFGS), one step secant (OSS) and Levenberg-Marquardt (LM), were used for training. All of these algorithms belong to the BP algorithm family 14 .
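Choosing the hidden-layer size by MSE can be sketched as follows. This is an illustration only: it assumes plain per-sample gradient descent as the training algorithm and scores each candidate on the training data, whereas the study compared several Matlab training algorithms and does not state which data the MSE was computed on. All function names, the learning rate, the epoch count, and the toy data are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_net(p, m, rng):
    """Small random weights; each row stores its bias first, then its weights."""
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(p + 1)] for _ in range(m)]
    output = [[rng.uniform(-0.5, 0.5) for _ in range(m + 1)] for _ in range(2)]
    return hidden, output

def forward(net, x):
    hidden, output = net
    h = [sigmoid(row[0] + sum(w * xi for w, xi in zip(row[1:], x))) for row in hidden]
    o = [sigmoid(row[0] + sum(b * hj for b, hj in zip(row[1:], h))) for row in output]
    return h, o

def mse(net, X, Y):
    """Mean squared error over all samples and both output nodes."""
    total = sum((ok - tk) ** 2
                for x, t in zip(X, Y)
                for ok, tk in zip(forward(net, x)[1], t))
    return total / (2 * len(X))

def train_gd(net, X, Y, lr=0.5, epochs=300):
    """Per-sample gradient descent on the squared error (backpropagation)."""
    hidden, output = net
    for _ in range(epochs):
        for x, t in zip(X, Y):
            h, o = forward(net, x)
            # Error signals at the output nodes, then propagated back to the hidden nodes.
            d_out = [(ok - tk) * ok * (1 - ok) for ok, tk in zip(o, t)]
            d_hid = [hj * (1 - hj) * sum(d * row[j + 1] for d, row in zip(d_out, output))
                     for j, hj in enumerate(h)]
            for d, row in zip(d_out, output):
                row[0] -= lr * d
                for j, hj in enumerate(h):
                    row[j + 1] -= lr * d * hj
            for d, row in zip(d_hid, hidden):
                row[0] -= lr * d
                for s, xs in enumerate(x):
                    row[s + 1] -= lr * d * xs
    return net

def select_hidden_nodes(X, Y, candidates, seed=0):
    """Train one network per candidate size and return the size with the lowest MSE."""
    scores = {}
    for m in candidates:
        rng = random.Random(seed)
        net = train_gd(init_net(len(X[0]), m, rng), X, Y)
        scores[m] = mse(net, X, Y)
    return min(scores, key=scores.get), scores

# Hypothetical toy data: two covariates, two binary outcomes per row.
X_toy = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y_toy = [[0, 0], [0, 1], [1, 0], [1, 1]]
best_m, scores = select_hidden_nodes(X_toy, Y_toy, range(8, 15))
print(best_m, scores[best_m])
```

In practice the candidate sizes would be scored on data held out from training, so that a larger hidden layer is not favored merely because it fits the training set more closely.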

JRHS 2016; 16(4):190-194
After the final architecture of the ANN model was determined and the best training algorithm selected, a genetic algorithm was used to optimize the initial weights of the ANN model, and the hybrid ANN-GA model was fitted to the data. Figure 2 shows the stages of implementing the proposed hybrid model, in which the genetic algorithm optimizes the initial values of the ANN weights. A prediction of the bivariate models was considered correct when both y1 and y2 were predicted correctly. Prediction accuracy, defined as the percentage of correct joint predictions of the two binary outcomes, was used to evaluate the models. To implement the models, Matlab 2013a was used for the ANN and ANN-GA models and the Zelig package in R 3.2.2 for the bivariate logistic regression model 13 .
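The two ideas in this paragraph, a genetic algorithm searching for good initial weights and accuracy counted only when both outcomes are right, can be sketched in plain Python. This is a hedged illustration rather than the procedure of Figure 2: the fitness function (MSE of the untrained network), tournament selection, one-point crossover, Gaussian mutation, elitism, the 0.5 cutoff, and all names, parameter values, and toy data are assumptions. In the hybrid model, the returned weights would then serve as the starting point for BP training.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def n_weights(p, n_hidden):
    """Number of weights and biases in a p-input, one-hidden-layer, two-output MLP."""
    return n_hidden * (p + 1) + 2 * (n_hidden + 1)

def forward(weights, x, n_hidden):
    """Forward pass with all weights stored in one flat list (the GA chromosome)."""
    p, pos, hidden, outputs = len(x), 0, [], []
    for _ in range(n_hidden):
        bias, w = weights[pos], weights[pos + 1:pos + 1 + p]
        hidden.append(sigmoid(bias + sum(wi * xi for wi, xi in zip(w, x))))
        pos += p + 1
    for _ in range(2):
        bias, b = weights[pos], weights[pos + 1:pos + 1 + n_hidden]
        outputs.append(sigmoid(bias + sum(bj * hj for bj, hj in zip(b, hidden))))
        pos += n_hidden + 1
    return outputs

def mse(weights, X, Y, n_hidden):
    total = sum((o - t) ** 2
                for x, y in zip(X, Y)
                for o, t in zip(forward(weights, x, n_hidden), y))
    return total / (2 * len(X))

def joint_accuracy(Y_true, Y_pred):
    """Percentage of cases in which BOTH binary outcomes are predicted correctly."""
    hits = sum(1 for t, q in zip(Y_true, Y_pred) if tuple(t) == tuple(q))
    return 100.0 * hits / len(Y_true)

def ga_initial_weights(X, Y, n_hidden, pop_size=20, generations=30,
                       mutation_rate=0.1, seed=0):
    """Search for initial weights with low MSE; returns (best weights, best-MSE history)."""
    rng = random.Random(seed)
    n = n_weights(len(X[0]), n_hidden)

    def fitness(w):
        return mse(w, X, Y, n_hidden)

    pop = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness)
        history.append(fitness(pop[0]))
        children = [pop[0][:]]                        # elitism: keep the best unchanged
        while len(children) < pop_size:
            a = min(rng.sample(pop, 3), key=fitness)  # reproduction via tournaments
            b = min(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n)                 # one-point crossover
            child = a[:cut] + b[cut:]
            child = [w + rng.gauss(0, 0.3) if rng.random() < mutation_rate else w
                     for w in child]                  # mutation
            children.append(child)
        pop = children
    best = min(pop, key=fitness)
    history.append(fitness(best))
    return best, history

# Hypothetical toy data: two covariates, two binary outcomes per row.
X_toy = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y_toy = [[0, 0], [0, 1], [1, 0], [1, 1]]
best, history = ga_initial_weights(X_toy, Y_toy, n_hidden=3)
preds = [tuple(int(o >= 0.5) for o in forward(best, x, 3)) for x in X_toy]
print(history[0], history[-1], joint_accuracy(Y_toy, preds))
```

Because the best chromosome is copied unchanged into each new generation, the best MSE recorded in `history` can never increase from one generation to the next.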

Results
Of the 263 patients, 221 (84.0%) had experienced heart block, of whom 6.3% died, and 42 (15.9%) had not experienced heart block, of whom 19.0% died. The correlation between the two outcome variables was significant (P=0.006). Tables 1 and 2 present descriptive information on the general characteristics of the patients. The results of the bivariate logistic regression model for the significant independent variables are shown in Table 3. Age, level of troponin and history of heart disease were significant variables in the bivariate model. The prediction accuracy of the ANN model with different training algorithms for the training and test data sets is presented in Table 4. Among the training algorithms used in the ANN model, the LM algorithm had the highest performance. Table 5 compares the prediction accuracy of the hybrid ANN-GA model with that of the BLR and ANN models. In both the training and test data sets, the hybrid ANN-GA model had better accuracy than the other models.

Discussion
In this paper, we proposed a new approach based on a hybrid ANN-GA model for the joint prediction of bivariate dependent binary outcomes. We compared the prediction accuracy of this model with that of other, traditional models for the joint prediction of the occurrence of heart block and death in MI patients. The results showed that the proposed hybrid ANN-GA model performed better than the BLR and ANN models. The better performance of the ANN model compared with classic models has been confirmed previously [14][15][16] . Because the ANN model lacks many of the limitations of classic models, in many situations it can be a suitable alternative to these models when some (or all) of their conditions are not met in the analysis of actual data 15 . In addition, the results of this study showed that the hybrid ANN-GA model, by optimizing the parameters of the ANN model, can improve its precision.
Despite the benefits of ANN and hybrid ANN-GA models, these methods have some limitations. For example, in these models, statistical inference on the parameters and testing for a significant relationship between dependent and independent variables are not possible, because the distribution of the parameters is not specified in ANN and hybrid models 15 .
ANN and hybrid models are more appropriate when the priority is prediction of the dependent variables, or when the data have a nonlinear and complex structure. If the primary aim is to explain a clear association between dependent and independent variables and to study the effect of the independent variables on the dependent variables, then classic models such as the logistic regression model are preferable 17 .
Given the limitations of conventional statistical methods for modeling bivariate responses in actual data, the method proposed in the present study is also recommended for similar problems.

Conclusions
The hybrid ANN-GA model predicted heart block and death simultaneously in MI patients more accurately than the ANN and BLR models. Considering the importance of accurate prediction in medical studies and the limitations of classical statistical methods for modeling bivariate responses, ANN and hybrid ANN-GA models are a suitable alternative for the analysis of bivariate binary responses.